What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models.
Abstract
OBJECTIVE: Statistical models, such as linear or logistic regression or survival analysis, are frequently used to answer scientific questions in psychosomatic research. Many who use these techniques, however, apparently fail to appreciate fully the problem of overfitting, i.e., capitalizing on the idiosyncrasies of the sample at hand. Overfitted models will fail to replicate in future samples, thus creating considerable uncertainty about the scientific merit of the finding. The present article is a nontechnical discussion of the concept of overfitting and is intended to be accessible to readers with varying levels of statistical expertise. The notion of overfitting is presented in terms of asking too much from the available data. Given a certain number of observations in a data set, there is an upper limit to the complexity of the model that can be derived with any acceptable degree of uncertainty. Complexity arises as a function of the number of degrees of freedom expended (the number of predictors, including complex terms such as interactions and nonlinear terms) against the same data set during any stage of the data analysis. Theoretical and empirical evidence, with a special focus on the results of computer simulation studies, is presented to demonstrate the practical consequences of overfitting with respect to scientific inference. Three common practices (automated variable selection, pretesting of candidate predictors, and dichotomization of continuous variables) are shown to pose a considerable risk for spurious findings in models. The dilemma between overfitting and exploring candidate confounders is also discussed. Alternative means of guarding against overfitting are discussed, including variable aggregation and the fixing of coefficients a priori. Techniques that account and correct for complexity, including shrinkage and penalization, are also introduced.
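As a rough illustration of the kind of computer simulation the abstract refers to, the sketch below shows how pretesting candidate predictors can produce a spurious finding. It is not the article's own simulation: the function names, the sample size (30 observations), the number of candidates (40), and the use of pure-noise data are all assumptions chosen for illustration. Every predictor is unrelated to the outcome, yet the best-looking one in the original sample shows a sizeable correlation that does not carry over to a new sample.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def noise(rng, n):
    """Draw n independent standard-normal values."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def pretest_once(rng, n_obs, n_candidates):
    """Screen pure-noise candidates and keep the strongest one.

    Returns (|r| of the selected predictor in the original sample,
             |r| of that predictor in a replication sample).
    """
    y = noise(rng, n_obs)
    best_abs_r = max(
        abs(pearson_r(noise(rng, n_obs), y)) for _ in range(n_candidates)
    )
    # Assumption: the selected predictor is truly unrelated to the outcome,
    # so a replication sample is simply a fresh, independent draw of both.
    replication_abs_r = abs(pearson_r(noise(rng, n_obs), noise(rng, n_obs)))
    return best_abs_r, replication_abs_r


def simulate_pretesting(n_reps=200, n_obs=30, n_candidates=40, seed=0):
    """Average the two |r| values over many simulated studies."""
    rng = random.Random(seed)
    results = [pretest_once(rng, n_obs, n_candidates) for _ in range(n_reps)]
    mean_selected = sum(r[0] for r in results) / n_reps
    mean_replication = sum(r[1] for r in results) / n_reps
    return mean_selected, mean_replication


if __name__ == "__main__":
    selected, replication = simulate_pretesting()
    print(f"mean |r| of selected predictor, original sample: {selected:.2f}")
    print(f"mean |r| of same predictor, new sample:          {replication:.2f}")
```

Increasing `n_candidates` or decreasing `n_obs` in this sketch widens the gap between the two averages, which is one way to see what "asking too much from the available data" means in practice.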
Journal: Psychosomatic Medicine
Volume 66, Issue 3
Publication date: 2004